ABSTRACT

We measure moments of the galaxy count probability distribution function in the 
two-degree field galaxy redshift survey (2dFGRS). The survey is divided into volume 
limited subsamples in order to examine the dependence of the higher order
clustering on galaxy luminosity. We demonstrate the hierarchical scaling of the averaged
p-point galaxy correlation functions, $\bar{\xi}_p$, up to p = 6. The hierarchical amplitudes,
$S_p = \bar{\xi}_p/\bar{\xi}_2^{p-1}$, are approximately independent of the cell radius used to smooth the
galaxy distribution on small to medium scales. On larger scales we find the higher
order moments can be strongly affected by the presence of rare, massive superstructures
in the galaxy distribution. The skewness $S_3$ has a weak dependence on luminosity,
approximated by a linear dependence on log luminosity. We discuss the implications 
of our results for simple models of linear and non-linear bias that relate the galaxy 
distribution to the underlying mass. 

Key words: galaxies: statistics, cosmology: theory, large-scale structure. 



1 INTRODUCTION 

The pattern of galaxy clustering can be quantified by measuring
the galaxy count probability distribution function
(CPDF) on a range of smoothing scales. The CPDF gives the
probability that a randomly chosen region of the universe
will contain a particular number of galaxies, and is typically
expressed as a function of both the size of the region
smoothed over and the galaxy number within that volume.
Traditionally, most effort has been directed at measuring
the second moment of the count distribution, the variance,
$\bar{\xi}_2$, through the autocorrelation function or, equivalently, its
Fourier transform, the power spectrum (e.g. Percival et al.
2001; Padilla & Baugh 2003; Tegmark et al. 2004). The
higher order moments of the CPDF, expressed as volume
averaged correlation functions, $\bar{\xi}_p$ ($p = 2, 3, \ldots$), provide a
much more detailed description of galaxy clustering, probing 
the shape of the low and high count tails of the distribution. 

The higher order moments of the dark matter distri- 
bution are known to display a hierarchical scaling in which 
the p-point volume averaged correlation functions, $\bar{\xi}_p$, can
be written in terms of the variance of the count distribution,
$\bar{\xi}_2$: $\bar{\xi}_p = S_p \bar{\xi}_2^{p-1}$ (e.g. see Peebles 1980, Juszkiewicz, Bouchet
& Colombi 1993, Bernardeau 1994, Baugh, Gaztanaga & 
Efstathiou 1995, Gaztanaga & Baugh 1995, Fosalba & 
Gaztanaga 1998). This scaling is a signature of the evolution 
under gravitational instability of an initially Gaussian dis- 
tribution of density fluctuations. A remarkable feature of the 
scaling is that the values of the hierarchical amplitudes, Sp, 
on scales for which the density field evolves linearly or in a 
quasi-linear fashion, are insensitive to cosmic epoch and es- 
sentially independent of the cosmological density parameter 
or the value of the cosmological constant. For a comprehen- 
sive review of such results see Bernardeau et al. (2002) and 
references therein. 

Departures from the hierarchical scaling of the higher 
order moments could conceivably arise in three ways: 

(i) A strongly non-Gaussian distribution of primordial den- 
sity waves as could arise, for instance, due to a seed non- 
linear fluctuation such as a global texture (see Gaztanaga & 
Mahonen 1996, Gaztanaga & Fosalba 1998, Scoccimarro, Se- 
fusatti & Zaldarriaga 2003 for examples of how the Sp scale 
in this case). This avenue now seems unlikely, following the 
clear detection of multiple acoustic peaks in the power spec- 
trum of cosmic microwave background temperature fluctu- 
ations (Netterfield et al. 2002; Hinshaw et al. 2003; Mason 
et al. 2003; Scott et al. 2003; Kuo et al. 2004); such peaks 
are difficult to reconcile with models that include cosmological
defects (Kamionkowski & Kosowsky 1999). Moreover,
strongly non-Gaussian primordial fluctuations are ruled
out by the first year WMAP results (Komatsu et al. 2003;
Gaztanaga & Wagg 2003).

(ii) A weakly non-Gaussian distributed primordial density 
field, resulting from a non-linear perturbation to a Gaussian 
density field. This scenario is difficult to distinguish from the 
evolution of an initially Gaussian field under gravitational 
instability, because the perturbation can introduce a shift to 
the amplitudes Sp that is also hierarchical. This can happen 
even in the case where the non-linear perturbation produces 
a negligible effect on the power spectrum (Bernardeau et al. 
2002). 

(iii) The spatial bias between the galaxy distribution and 
the underlying distribution of dark matter. Fry & Gaztanaga 
(1993) demonstrated that, under a local biasing prescrip- 
tion, the hierarchical scaling of the higher order moments is 
preserved but the amplitudes $S_p$ can change as a function
of time or luminosity. This conclusion is also reached using
more sophisticated, physically motivated semi-analytic
models of galaxy formation (Kauffmann et al. 1999; Benson
et al. 2000; Scoccimarro et al. 2001). 

Previous attempts to measure the higher order corre- 
lation functions have been hamstrung by the small size of 
the available redshift surveys, a shortcoming that is exacer- 
bated once volume limited subsamples are constructed (Hui 
& Gaztañaga 1999). Nevertheless, early counts-in-cells studies
established that the first few higher order moments of
the galaxy distribution displayed the hierarchical scaling ex- 
pected in the gravitational instability framework (Groth & 
Peebles 1977; Peebles 1980; Gaztanaga 1992; Bouchet et al. 
1993; Fry & Gaztanaga 1994, Ghigna et al. 1996, Feldman et 
al. 2001). Such analyses were typically limited to measuring 
the three and four point correlation functions. The nature 
of the dependence of the hierarchical amplitudes on lumi- 
nosity has not been convincingly established. Recent work 
to investigate this in the optical (Hoyle et al. 2000) and in 
the far infrared (Szapudi et al. 2000) was restricted to prob- 
ing fairly narrow ranges of luminosity due to the size of the 
redshift surveys then available. 

The advent of multi-fibre spectrographs exploited by
sustained observing campaigns has led to a new generation
of redshift surveys which represent order of magnitude advances
over surveys completed in the last millennium. The
Sloan Digital Sky Survey (York et al. 2000) and the Two-degree
Field Galaxy Redshift Survey (2dFGRS, Colless et
al. 2001) have provided maps of the clustering pattern of
galaxies with unprecedented detail. Analysis of the 2dFGRS
clustering has suggested that the flux limited sample could
be an essentially unbiased tracer of the dark matter in the
Universe (Lahav et al. 2002; Verde et al. 2002).^1 These results
confirmed previous deductions about galaxy bias (e.g.
Gaztañaga 1994, Frieman & Gaztañaga 1999, Gaztañaga &
Juszkiewicz 2001) reached using the parent angular catalogue
of the 2dFGRS, the APM Galaxy Survey (Maddox
et al. 1990, 1996). The 2dFGRS covers a volume that is
an appreciable fraction of that sampled by the APM Survey,
with full redshift coverage (modulo the relatively small
redshift incompleteness that still remains). This means that,
for the first time, a measurement of the higher order moments
is possible in three dimensions with comparable accuracy
to that attainable in two dimensions, but without the
added complication of the effects of projection (Bernardeau
& Gaztañaga 1996; Szapudi & Gaztañaga 1998).

The sheer number of galaxies in the 2dFGRS allows it 
to be subdivided in order to probe the dependence of the 
clustering signal on intrinsic galaxy properties in more de- 
tail. Norberg et al. (2001) found that the amplitude of the 
projected two point correlation function scales with luminos- 
ity, and characterised this trend using a relative bias factor 
with a linear dependence on luminosity. In this paper we 
extend the work of Norberg et al. to study the higher order 
clustering of galaxies in the 2dFGRS and its dependence on 
luminosity. Our approach is the same as that followed in 
Baugh et al. (2004), who measured the higher order corre- 
lation functions of a sample of L* galaxies and found that 
they follow a hierarchical scaling. 

^1 Note that with the weighting scheme adopted to compensate
for the radial selection function, the characteristic luminosity of
the flux limited 2dFGRS used in these studies is 2L*.

We provide a brief review of the measurement of the moments
of the CPDF in Section 2. In Section 3, we discuss the
specific application of this method to the 2dFGRS; an important
feature of our analysis is the use of mock catalogues
to estimate the errors on our measurements (see Section 3.3).
Our results for the higher order correlation functions and the
hierarchical amplitudes are given in Section 4. We quantify
the variation of the higher order moments with luminosity
in Section 5, and discuss the interpretation of these results
in terms of a simple relative bias model. Our conclusions
are set out in Section 6. Throughout, we adopt standard
present day values of the cosmological parameters to compute
comoving distance from redshift: a density parameter
$\Omega_m = 0.3$ and a cosmological constant $\Omega_\Lambda = 0.7$.
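
The redshift to distance conversion can be illustrated with a minimal Python sketch. It assumes a flat universe (density parameter plus cosmological constant equal to one) and uses a simple composite Simpson rule; the function name `comoving_distance` and its arguments are hypothetical and are not taken from the survey software.

```python
import math

C_KM_S = 299792.458  # speed of light in km/s
H0 = 100.0           # Hubble constant in h km/s/Mpc, so distances come out in h^-1 Mpc


def comoving_distance(z, omega_m=0.3, omega_lambda=0.7, steps=1000):
    """Comoving distance in h^-1 Mpc, assuming a flat universe (illustrative sketch)."""
    def inv_e(zp):
        # 1 / E(z) for matter plus a cosmological constant
        return 1.0 / math.sqrt(omega_m * (1.0 + zp) ** 3 + omega_lambda)

    # Composite Simpson rule on [0, z]; `steps` must be even
    h = z / steps
    total = inv_e(0.0) + inv_e(z)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * inv_e(i * h)
    return (C_KM_S / H0) * total * h / 3.0


d_bright = comoving_distance(0.266)
```

For z = 0.266, the upper redshift limit of the brightest sample in Table 1, the result should land close to the maximum comoving distance quoted there; small differences are expected because the tabulated redshifts are rounded.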



2 COUNT-IN-CELLS STATISTICS 

The count probability distribution function (CPDF) and its 
moments have been used extensively to quantify the clus- 
tering pattern of galaxies (e.g. White 1979; Peebles 1980). 
In this Section we give an outline of the counts-in-cells ap- 
proach, explaining how the volume averaged p-point corre- 
lations are derived from the CPDF and give a brief theo- 
retical background. A more comprehensive discussion of the 
counts-in-cells approach can be found in Bernardeau et al. 
(2002). 



$$m_p(R) = \langle (N - \bar{N})^p \rangle = \sum_{N=0}^{\infty} P_N(R)\,(N - \bar{N})^p , \tag{3}$$

where $\bar{N}$ is the mean number of galaxies in a cell of volume
V and is calculated directly from the CPDF:

$$\bar{N} = \sum_{N=0}^{\infty} N P_N . \tag{4}$$

For the case of a continuous distribution, $\bar{\xi}_p$ is related to the
corresponding cumulant, $\mu_p$, through $\bar{N}^p \bar{\xi}_p = \mu_p$, where the
cumulants are defined as (see Gaztañaga 1994 for details):

$$\mu_2 = m_2 ; \quad \mu_3 = m_3 ; \quad \mu_4 = m_4 - 3 m_2^2 ; \quad \mu_5 = m_5 - 10\, m_3 m_2 . \tag{5}$$

If instead we are dealing with a discrete distribution, these
relations must be corrected. A Poisson shot noise model is
adopted (see Baugh et al. 1995 for a discussion of this point),
to give corrected estimates of the moments, $k_p$:

$$k_2 = \mu_2 - \bar{N} ; \quad k_3 = \mu_3 - 3k_2 - \bar{N} ,$$
$$k_4 = \mu_4 - 7k_2 - 6k_3 - \bar{N} ,$$
$$k_5 = \mu_5 - 15k_2 - 25k_3 - 10k_4 - \bar{N} . \tag{6}$$

The volume-averaged correlation functions, calculated from
the galaxy CPDF, follow directly from the relation $\bar{\xi}_p = k_p/\bar{N}^p$.
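
The chain from cell counts to volume-averaged correlation functions (Eqs. 3 to 6) can be sketched in a few lines of Python. This is only an illustration for p = 2 to 5 under the Poisson shot noise model described above; the function name `volume_averaged_correlations` is hypothetical.

```python
def volume_averaged_correlations(counts):
    """Return {p: xi_bar_p} for p = 2..5 from a list of galaxy counts, one per cell."""
    n_cells = len(counts)
    nbar = sum(counts) / n_cells

    # Central moments m_p of the count distribution (Eq. 3)
    m = {p: sum((n - nbar) ** p for n in counts) / n_cells for p in range(2, 6)}

    # Cumulants mu_p (Eq. 5)
    mu = {
        2: m[2],
        3: m[3],
        4: m[4] - 3.0 * m[2] ** 2,
        5: m[5] - 10.0 * m[3] * m[2],
    }

    # Poisson shot noise correction (Eq. 6)
    k = {}
    k[2] = mu[2] - nbar
    k[3] = mu[3] - 3.0 * k[2] - nbar
    k[4] = mu[4] - 7.0 * k[2] - 6.0 * k[3] - nbar
    k[5] = mu[5] - 15.0 * k[2] - 25.0 * k[3] - 10.0 * k[4] - nbar

    # xi_bar_p = k_p / nbar^p
    return {p: k[p] / nbar ** p for p in k}


xi_toy = volume_averaged_correlations([0, 2])
```

The two-cell input is a toy case chosen so that the arithmetic can be followed by hand; a real measurement would use the counts from all cells thrown at a given radius.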



2.1 Estimating the p-point volume averaged 
correlation functions 

The p-point moment, or (un-reduced) correlation function,
$m_p(r_1, \ldots, r_p) = \langle \delta(r_1) \ldots \delta(r_p) \rangle$, can be used to fully
characterise the clustering of a fluctuating field $\delta(r)$. The reduced
p-point correlation function, $\xi_p(r_1, \ldots, r_p)$, is defined as the
connected part of the above p-point correlation in such a way
that for p > 2: $\xi_p = 0$ for a Gaussian field (see Bernardeau
et al. 2002 for more details). Following the standard convention,
for the remainder of this paper when we talk about
correlations we will always assume they are "reduced"
correlations.

The p-point volume averaged galaxy correlation function,
$\bar{\xi}_p(V)$, can be written as the integral of the p-point
correlation function, $\xi_p$, over the sampling volume, V
(Peebles 1980):

$$\bar{\xi}_p(V) = \frac{1}{V^p} \int_V d^3r_1 \ldots d^3r_p \; \xi_p(r_1, \ldots, r_p) . \tag{1}$$



A practical way in which to estimate $\bar{\xi}_p(V)$ is to randomly
throw cells down within the galaxy distribution, recording
the number of times a cell contains N galaxies so as to build
up the galaxy CPDF, $P_N(V)$. Since we adopt spherical cells,
the CPDF is a function of the sphere radius, R,

$$P_N(R) = \frac{N_N}{N_T} , \tag{2}$$

where $N_N$ is the number of cells that contain N galaxies out
of a total number of cells thrown down, $N_T$. The volume
averaged correlation functions $\bar{\xi}_p(V)$ are then related to the
moments of the CPDF, $m_p$ (Eq. 3).
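
The cell-throwing estimate of the CPDF (Eq. 2) can be sketched as follows. This toy version assumes a periodic cubic box of unit side and a brute-force neighbour search, neither of which reflects the geometry of a real survey; `count_pdf` and its parameters are hypothetical names.

```python
import random
from collections import Counter


def count_pdf(points, radius, n_cells, box=1.0, seed=0):
    """Estimate P_N(R) by throwing random spheres into a periodic cubic box."""
    rng = random.Random(seed)
    r2 = radius * radius
    tally = Counter()
    for _ in range(n_cells):
        cx, cy, cz = (rng.uniform(0.0, box) for _ in range(3))
        n = 0
        for x, y, z in points:
            # Minimum-image separations, so that cells wrap around the box edges
            dx = abs(x - cx)
            dy = abs(y - cy)
            dz = abs(z - cz)
            dx = min(dx, box - dx)
            dy = min(dy, box - dy)
            dz = min(dz, box - dz)
            if dx * dx + dy * dy + dz * dz <= r2:
                n += 1
        tally[n] += 1
    # P_N = N_N / N_T
    return {n: c / n_cells for n, c in sorted(tally.items())}


# Toy usage: unclustered points, so the mean count should approach n * V
rng = random.Random(1)
toy_points = [(rng.random(), rng.random(), rng.random()) for _ in range(300)]
toy_pdf = count_pdf(toy_points, radius=0.1, n_cells=1000)
```

In a real analysis the inner loop would be replaced by a spatial index, and the cell placement would have to respect the survey mask.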



2.2 Scaling of the higher order moments 

In the hierarchical model of clustering, all higher-order
correlations can be expressed in terms of the 2-point function,
$\bar{\xi}_2$, and dimensionless scaling coefficients, $S_p$:

$$\bar{\xi}_p = S_p \, \bar{\xi}_2^{\,p-1} . \tag{7}$$

Traditionally, $S_3 = \bar{\xi}_3/\bar{\xi}_2^2$ is referred to as the skewness of the
distribution and $S_4 = \bar{\xi}_4/\bar{\xi}_2^3$ as the kurtosis. The hierarchical
scaling of the higher order moments arises from the evolution
due to gravitational instability of an initially Gaussian
distribution of density fluctuations (see Bernardeau et al.
2002 and references therein).
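
As a small worked illustration of Eq. 7, the hierarchical amplitudes follow from the volume-averaged correlation functions by a single division. The function name `hierarchical_amplitudes` and the input values below are illustrative only, not measurements.

```python
def hierarchical_amplitudes(xi_bar):
    """Return {p: S_p} with S_p = xi_bar_p / xi_bar_2**(p - 1) for p >= 3 (Eq. 7)."""
    xi2 = xi_bar[2]
    return {p: xi / xi2 ** (p - 1) for p, xi in xi_bar.items() if p >= 3}


# Made-up inputs chosen so the ratios are easy to check by hand
s_example = hierarchical_amplitudes({2: 0.5, 3: 0.5, 4: 1.0})
```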



2.3 Systematic effects: biased estimators 

In addition to sampling errors (see Section 3.3 below), the
estimation of the hierarchical amplitudes can be compromised
by systematic effects, as discussed in some detail by Hui &
Gaztañaga (1999). These authors identified two sources of
error that could lead to a systematic bias in the inferred
values of $S_p$. The first effect arises from biases in the estimates
of the higher order correlation functions themselves, known
as the "integral constraint bias" (see e.g. Bernstein 1994).
The second effect originates in the nonlinear combination of
$\bar{\xi}_p$ and $\bar{\xi}_2$ to form $S_p$; this is called the "ratio bias". The
latter effect dominates on large scales and tends to cause the
inferred values of the $S_p$ to be biased low. Hui & Gaztañaga
wrote down expressions for these biases which accurately
reproduce the systematic effects seen upon estimating the
hierarchical amplitudes from sub-volumes extracted from
N-body simulations.

As mentioned above, we will use different volume limited
samples to study the luminosity dependence of the
hierarchical amplitudes, $S_p$. As the luminosity that defines a
sample is made brighter, the volume of the sample increases.
Thus the estimation biases tend to cause the $S_p$ to increase
with sample luminosity. This spurious tendency has already
been reported in the literature (see Hui & Gaztañaga 1999).
For volumes of the size used in our analysis, it turns out
that the predicted biases are smaller than the corresponding
sampling errors (e.g. see figure 3 in Hui & Gaztañaga
1999). This is the first time that a redshift survey has been
available which is large enough to overcome such systematic
biases.



2.4 Galaxy biasing 

Galaxy samples constructed using different selection crite- 
ria display different clustering patterns. This leads one to 
the conclusion that distinct samples of galaxies must trace 
the underlying mass distribution in different ways, a phe- 
nomenon that is generally known as galaxy bias. 

A simple, heuristic scheme describing the impact of a
local bias on the scaling of the higher order moments was
proposed by Fry & Gaztañaga (1993). These authors demonstrated
that in this case, the scaling of the higher order
moments of the galaxy distribution should mirror that of
the dark matter, though possibly with different values for
the hierarchical amplitudes $S_p$. Fry & Gaztañaga made the
assumption that the density contrast in the galaxy distribution,
$\delta^{G}$, i.e. the fractional fluctuation around the mean
density, could be written as a Taylor expansion of the density
contrast of the dark matter, $\delta^{DM}$:

$$\delta^{G} = \sum_{k=0}^{\infty} \frac{b_k}{k!} \left(\delta^{DM}\right)^k . \tag{8}$$



On scales where the variance, $\bar{\xi}_2$, is small, the leading
order contribution to the two-point volume averaged
correlation function of galaxies has the form:

$$\bar{\xi}_2^{G} = b_1^2 \, \bar{\xi}_2^{DM} , \tag{9}$$



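where $b_1$ is the ubiquitous linear bias, b. The leading order
forms for the hierarchical amplitudes, $S_p$, for the cases p = 3
and p = 4 are:

$$S_3^{G} = \frac{1}{b_1} \left( S_3^{DM} + 3 c_2 \right) , \tag{10}$$

$$S_4^{G} = \frac{1}{b_1^2} \left( S_4^{DM} + 12\, c_2 S_3^{DM} + 4 c_3 + 12\, c_2^2 \right) , \tag{11}$$

where we use the notation $c_k = b_k/b_1$. Expressions for the
hierarchical amplitudes are given up to p = 7 in Fry &
Gaztañaga (1993).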
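
To make the effect of a local bias concrete, the leading order relations of Eqs. 9 to 11 can be written as a short Python function. This is a sketch only: the name `biased_amplitudes` and the input values are hypothetical, and the dark matter amplitudes are treated as given numbers.

```python
def biased_amplitudes(s3_dm, s4_dm, b1, c2=0.0, c3=0.0):
    """Leading order galaxy S_3 and S_4 under a local bias (Eqs. 9-11).

    c2 and c3 are the ratios b_2/b_1 and b_3/b_1. Returns the factor that
    multiplies the dark matter variance, followed by S_3 and S_4 for galaxies.
    """
    variance_factor = b1 ** 2
    s3_g = (s3_dm + 3.0 * c2) / b1
    s4_g = (s4_dm + 12.0 * c2 * s3_dm + 4.0 * c3 + 12.0 * c2 ** 2) / b1 ** 2
    return variance_factor, s3_g, s4_g


# Purely linear bias: amplitudes are simply diluted by powers of b1
linear_case = biased_amplitudes(3.0, 16.0, b1=2.0)
# Unbiased variance but non-zero second order term: S_3 and S_4 still change
nonlinear_case = biased_amplitudes(3.0, 16.0, b1=1.0, c2=0.5)
```

The second call shows the point made in the text: a sample can trace the variance of the mass without bias and still have different hierarchical amplitudes.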

Mo, Jing & White (1997) give theoretical predictions
for the coefficients $b_k$ using the Press & Schechter (1974)
formalism and exploiting the framework developed by Cole
& Kaiser (1989) and Mo & White (1996). For halos of mass
M, the first two bias factors (k = 1 and 2) are given by:

$$b_1 = 1 + \frac{\nu^2 - 1}{\delta_c} , \tag{12}$$

$$b_2 = 2 \left( 1 - \frac{17}{21} \right) \frac{\nu^2 - 1}{\delta_c} + \frac{\nu^2}{\delta_c^2} \left( \nu^2 - 3 \right) , \tag{13}$$

where $\nu = \delta_c/\sigma(M)$, $\delta_c$ is the linear theory overdensity at
the time of collapse ($\delta_c = 1.686$ for $\Omega = 1$) and $\sigma(M)$ is the
linear rms fluctuation on the mass scale of the halos. This is a
simple model but nevertheless it shows some tendencies that
are correct. For example, a typical mass halo corresponding
to $\nu = 1$ displays an unbiased variance with $b_1 = 1$, but
introduces a bias in the skewness, since $c_2 = b_2 = -0.7$. As
a further illustration, consider massive halos defined by $\nu > 3$;
in this case the Mo, Jing & White theory predicts that
$c_2 > 0$, while less massive halos could produce $c_2 < 0$. To get
more realistic values of $b_k$ for galaxy bias, a prescription has
to be adopted for populating dark matter halos of a given
mass with galaxies of a given luminosity (Benson et al. 2001;
Scoccimarro et al. 2001; Berlind et al. 2003).
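
The two examples quoted above can be checked numerically. The sketch below assumes the commonly quoted form of the Mo, Jing & White (1997) halo bias factors, with a second order coefficient of -17/21; the function name `halo_bias` is hypothetical.

```python
DELTA_C = 1.686  # linear theory collapse overdensity for Omega = 1


def halo_bias(nu, delta_c=DELTA_C):
    """First two halo bias factors (b1, b2), assuming the Mo, Jing & White form."""
    b1 = 1.0 + (nu ** 2 - 1.0) / delta_c
    b2 = (2.0 * (1.0 - 17.0 / 21.0) * (nu ** 2 - 1.0) / delta_c
          + (nu / delta_c) ** 2 * (nu ** 2 - 3.0))
    return b1, b2


b1_typical, b2_typical = halo_bias(1.0)   # typical mass halo, nu = 1
b1_massive, b2_massive = halo_bias(3.0)   # rare, massive halo
c2_massive = b2_massive / b1_massive
```

With these expressions a halo with nu = 1 has unit linear bias and a negative second order term close to -0.7, while the nu = 3 case gives a positive c_2, in line with the statements in the text.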

In the interpretation of the higher order moments presented
in this paper we will make use of a relative bias, which
describes the change in clustering compared with that measured
for a reference sample (Norberg et al. 2001, 2002a). Using
Eq. 9 as a guide, we define the relative bias, $b_r = b_1/b_1^*$,
of a sample as the square root of the ratio of the 2-point
correlation function measured for the sample over that measured
for the reference sample, denoted by an asterisk (the
reference sample will be defined explicitly in Section 5):

$$b_r = \frac{b_1}{b_1^*} = \left( \frac{\bar{\xi}_2}{\bar{\xi}_2^*} \right)^{1/2} . \tag{14}$$

Thus, we can obtain an estimate of the relative bias from
the ratio of the variances.

When the linear bias is a good approximation ($c_k \simeq 0$
for k > 1), we can relate $S_p$ in different galaxy samples
regardless of the underlying DM value of $S_p$:

$$S_p^{G} \simeq \frac{S_p^*}{b_r^{\,p-2}} . \tag{15}$$

More generally, one can manipulate Eq. 10 to write down
an expression comparing $S_3^{G}$ for two galaxy samples, eliminating
$S_3^{DM}$ for the underlying dark matter (e.g. see Eq. 9
in Fry & Gaztañaga 1993). For the skewness:

$$S_3^* = b_r S_3^{G} - 3\, \frac{c_2 - c_2^*}{b_1^*} , \tag{16}$$

where an asterisk denotes a quantity describing the reference
sample, and $b_r = b_1/b_1^*$ is the relative bias defined above.
Any second order relative bias effects are thus given by:

$$c_2' \equiv \frac{c_2 - c_2^*}{b_1^*} = \frac{1}{3} \left( b_r S_3^{G} - S_3^* \right) . \tag{17}$$

As a special case, if the reference sample is un-biased (i.e.
$b_1^* = 1$ and $c_2^* = 0$), we then have $c_2' = c_2$.
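
The relative bias relations (Eqs. 14 and 17) reduce to two one-line functions. The sketch below uses hypothetical function names, and the inputs are the best fit values for sample 4 and the reference sample 3 as listed in Table 2.

```python
import math


def relative_bias(xi2_sample, xi2_reference):
    """Relative linear bias b_r from the ratio of the variances (Eq. 14)."""
    return math.sqrt(xi2_sample / xi2_reference)


def second_order_relative_bias(b_r, s3_sample, s3_reference):
    """Second order relative bias term c_2' from the two skewness values (Eq. 17)."""
    return (b_r * s3_sample - s3_reference) / 3.0


# b_r = 1.13 and S_3 = 2.01 for sample 4; S_3 = 1.95 for the reference sample 3
c2_prime_sample4 = second_order_relative_bias(1.13, 2.01, 1.95)
```

Combining the tabulated best fit numbers in this way gives a value of about 0.11, close to the 0.10 listed in Table 2; exact agreement is not expected because the tabulated term comes from a fit over a range of scales.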



3 APPLICATION TO THE 2dFGRS 

In this Section we describe the construction of volume lim- 
ited samples from the 2dFGRS (Section 3.1) and outline 
how we deal with the small, remaining incompleteness of 
the survey when we measure the CPDF (Section 3.2). The 
estimation of errors on the measured higher order moments 
is described in Section 3.3. We use the full 2dFGRS as our 
starting point. The final spectra were taken in April 2002, 
giving a total of 221,414 galaxies with high quality redshifts 
(i.e. with quality flag Q ≥ 3; see Colless et al. 2001). The
median depth of the full survey, to a nominal magnitude
limit of $b_J \approx 19.45$, is $z \approx 0.11$. We consider the two large
contiguous survey regions, one near the south galactic pole
(SGP) and the other towards the north galactic pole (NGP).
After restricting attention to the high redshift completeness
parts of the survey (see Colless et al. 2001; Norberg et al.
2002b), the effective solid angle covered by the NGP region
is 469 square degrees and that of the SGP is 670 square degrees.
Full details of the 2dFGRS and the construction and
use of the mask quantifying the completeness of the survey
can be found in Colless et al. (2001, 2003).

Table 1. Properties of the combined 2dFGRS SGP and NGP volume-limited catalogues (VLCs). Column 1 gives the numerical label of
the sample. Columns 2 and 3 give the faint and bright absolute magnitude limits that define the sample. The fourth column gives the
median luminosity of each volume limited sample in units of L*, computed using the Schechter function parameters quoted by Norberg
et al. (2002b). Columns 5, 6 and 7 give the number of galaxies, the mean number density and the mean inter-galaxy separation for
each VLC, respectively. Columns 8 and 9 state the redshift boundaries of each sample for the nominal apparent magnitude limits of the
survey; columns 10 and 11 give the corresponding comoving distances. Finally, column 12 gives the combined SGP and NGP volume.
All distances are comoving and are calculated assuming standard cosmological parameters ($\Omega_m = 0.3$ and $\Omega_\Lambda = 0.7$).
Units: rho_ave in 10^-3 h^3 Mpc^-3; d_mean, D_min and D_max in h^-1 Mpc; Volume in 10^6 h^-3 Mpc^3.

VLC ID  M_faint  M_bright  L/L*    N_G     rho_ave  d_mean  z_min   z_max   D_min   D_max   Volume
1       -17.0    -18.0     0.13    8038    10.9     4.51    0.009   0.058   24.8    169.9   0.74
2       -18.0    -19.0     0.33    23290   9.26     4.76    0.014   0.088   39.0    255.6   2.52
3       -19.0    -20.0     0.78    44931   5.64     5.62    0.021   0.130   61.1    375.6   7.97
4       -20.0    -21.0     1.78    33997   1.46     8.82    0.033   0.188   95.1    537.2   23.3
5       -21.0    -22.0     3.98    6895    0.110    20.9    0.050   0.266   146.4   747.9   62.8

Table 2. The best fit values and 2σ error ($\Delta\chi^2 = 4$) for $S_p$ (columns 4 to 7). The range of scales used in the fits is given in columns 2
and 3. The number in brackets after each error gives the reduced $\chi^2$ value for the fit, using the number of degrees of freedom derived from
the principal component analysis. The last two columns give the relative linear bias, $b_r$ (defined by Eq. 14) and the second order bias
term, $c_2'$ (defined by Eq. 17). The reference sample is sample number 3. These values are obtained for the full volume limited samples. A
blank entry (shown here as --) indicates that a reliable measurement of the particular hierarchical amplitude was not possible for the sample in question.

VLC ID  R_min  R_max  S_3                S_4              S_5            S_6              b_r                c_2'
1       0.71   7.1    2.58 ± 0.37 (0.1)  9.3 ± 4.0 (0.1)  34 ± 32 (0.1)  --               0.96 ± 0.16 (0.1)  0.17 ± 0.25 (0.1)
2       0.71   7.1    2.38 ± 0.25 (0.1)  8.2 ± 2.3 (0.9)  36 ± 20 (0.4)  185 ± 170 (0.1)  0.96 ± 0.08 (0.3)  0.11 ± 0.13 (0.1)
3       0.71   7.1    1.95 ± 0.18 (6.1)  5.5 ± 1.4 (2.3)  18 ± 11 (1.9)  46 ± 50 (1.1)    1                  --
4       0.80   8.9    2.01 ± 0.17 (1.2)  6.0 ± 1.5 (0.6)  22 ± 12 (0.4)  71 ± 80 (0.3)    1.13 ± 0.06 (2.8)  0.10 ± 0.08 (0.3)
5       2.2    11.2   2.39 ± 0.63 (0.5)  6.8 ± 7.0 (0.4)  --             --               1.30 ± 0.14 (0.9)  0.33 ± 0.31 (0.5)

We make use of mock 2dFGRS catalogues to test our 
algorithm for dealing with the spectroscopic incompleteness 
of the survey and to estimate errors on the measured higher 
order correlation functions. The construction of the mocks is 
described in Norberg et al. (2002b). In short, catalogues are 
extracted from the Virgo Consortium's $\Lambda$CDM Hubble Volume
simulation, which covers a volume of $27\,h^{-3}\,\mathrm{Gpc}^3$ (Evrard et
al. 2002). A heuristic bias scheme is applied to the smoothed 
distribution of dark matter in the simulation to select 'galax- 
ies' with a specified clustering pattern (Cole et al. 1998). The 
parameters of the biasing scheme are adjusted so that the 
extracted galaxies have the same projected correlation func- 
tion as measured for the flux limited 2dFGRS by Hawkins et 
al. (2003). Observers are placed within the Hubble Volume 
simulation according to the criteria set out in Norberg et al. 
(2002b). Mock catalogues are then extracted by applying 
the radial and angular selection functions of the 2dFGRS. 
Finally, the mock is degraded from uniform coverage within
the angular mask of the survey by applying the spectroscopic
completeness mask of the 2dFGRS.



3.1 Construction of volume limited catalogues 

In a flux limited sample the density of galaxies is a strong 
function of radial distance. This effect needs to be taken into 
account in clustering analyses (for an example of a technique 
appropriate to a counts-in-cells analysis, see Efstathiou et 
al. 1990). Alternatively, one may construct volume limited 
samples in which the radial selection function is constant 
and any variations in the density of galaxies are due only 
to large scale structure. This greatly simplifies the analysis 
at the expense of using a subset of the survey galaxies. The 
2dFGRS contains enough galaxies and covers sufficient vol- 
ume to permit the construction of volume limited samples 
corresponding to a wide baseline in luminosity from which 
robust measurements of the higher order correlation func- 
tions can be obtained. 

We follow the approach taken by Norberg et al. (2001, 
2002a) who measured the projected 2-point correlation func- 
tion of 2dFGRS galaxies in volume limited samples corre- 
sponding to bins in absolute magnitude. The samples are 
defined by a specified absolute magnitude range, with abso- 
lute magnitudes corrected to zero redshift (this correction 
is made using the k + e correction given by Norberg et al. 
2002b). As any survey has, in practice, a bright as well as a
faint flux limit, this implies that a selected galaxy should fall
between a minimum ($z_{min}$) and a maximum ($z_{max}$) redshift.
This then guarantees that all sample galaxies are visible
within the flux limits of the survey when displaced to any
depth within the volume of the sample. The properties of the
combined NGP and SGP volume limited samples examined
in this paper are given in Table 1.

Figure 1. A test of the scheme used to correct the measured
distribution of counts-in-cells for incompleteness in the 2dFGRS,
using mock catalogues. The plot shows the hierarchical amplitudes,
$S_p$, for orders p = 3, 4 and 5. The dotted lines show the
results from fully sampled mocks that have no incompleteness.
The dashed lines show how these results change when the
completeness mask of the 2dFGRS is applied to the mocks and no
compensation is made for the variable spectroscopic completeness.
The circles show the $S_p$ recovered once the correction to the
cell radius discussed in the text is made. The errorbars show the
rms scatter estimated from the mock catalogues.



3.2 Correcting for incompleteness 

There are two possible sources of incompleteness that need 
to be considered when estimating the galaxy count within 
a cell. The first is volume incompleteness, which can arise 
when some fraction of the cell volume samples a region of 
space that is not part of the 2dFGRS. This situation can 
arise because the survey has a complicated boundary and 
also because it contains holes excised around bright stars 
and other interlopers in the parent APM galaxy catalogue 
(Maddox et al. 1990, 1996). The second source of incom- 
pleteness is spectroscopic incompleteness. The final 2dFGRS 
catalogue is much more homogeneous than the 100k release 
(contrast the completeness mask of the final survey shown
in figure 1 of Hawkins et al. 2003 with the equivalent mask
depicted in figure 15 of Colless et al. 2001). However, the
spectroscopic completeness still varies with position on the 
sky and needs to be incorporated into the counts-in-cells 
analysis. 

It is therefore necessary to devise a strategy to compensate
for the fact that a cell will sample regions that have
varying spectroscopic completeness and which may even
straddle the survey boundary or a hole. We project the volume
enclosed by the cell onto the sky and estimate, using
the survey masks, the mean combined spectroscopic and volume
completeness, f, within the sphere. Rather than view
the consequence of this incompleteness as missed galaxies,
we instead consider it as missed volume. We compute a new
radius for the sphere given by $R' = f^{-1/3} R$; such a sphere
with radius R' will have an incomplete volume equivalent to
that of a fully complete sphere of radius R. Spheres for which
f is less than 50% are discarded. The galaxy count within
the sphere of radius R' then contributes to the CPDF at the
effective radius R. Each sphere thrown down is individually
scaled in this way according to its local incompleteness, as
given by the survey masks. We note that, due to our chosen
acceptable minimum completeness of 50%, the rescaling of
the cell radius is always less than the width of the radial
bins we use to plot the higher order correlation functions.
Our results are insensitive to the precise choice of completeness
threshold.
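
The radius rescaling can be written as a short helper. This is a minimal sketch of the rule described above, with hypothetical names (`rescaled_radius`, `min_completeness`); estimating the completeness f from the survey masks is not shown.

```python
def rescaled_radius(radius, completeness, min_completeness=0.5):
    """Return R' = f**(-1/3) * R, or None if the cell is too incomplete to use."""
    if completeness < min_completeness:
        return None  # cell discarded
    return radius * completeness ** (-1.0 / 3.0)


r_full = rescaled_radius(10.0, 1.0)    # fully complete cell: radius unchanged
r_half = rescaled_radius(10.0, 0.5)    # worst accepted case
r_reject = rescaled_radius(10.0, 0.4)  # below the threshold
```

At the 50% threshold the radius grows by a factor of 2^(1/3), roughly 1.26, which is the largest rescaling that can occur with this choice of threshold.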

An alternative method to correct cell counts is described 
in Efstathiou et al. (1990). In this commonly used approach 
it is the galaxy counts which are scaled up in proportion 
to the degree of incompleteness in the cell, as opposed to 
the cell volume as we have done. We have tried both cor- 
rection methods when calculating the higher order moments 
and find the results are essentially identical (see Croton et 
al. 2004 for further discussion of the relative strengths and 
weaknesses of both methods).

A test of our method for dealing with incompleteness is 
shown in Fig. 1. This plot shows the $S_p$ estimated from the
higher order correlation functions measured in mock 2dF- 
GRS catalogues (Norberg et al. 2002b). The dotted lines 
show the results for complete mocks, with uniform sampling 
of the galaxy distribution within the full angular boundary 
of the 2dFGRS. The dashed lines show how these results 
change once the mocks are degraded to mimic the spectro- 
scopic incompleteness and irregular geometry of the 2dF- 
GRS, without applying any correction to compensate for 
this incompleteness. The circles show the values of Sp re- 
covered on application of the correction for incompleteness 
described above. These results are in excellent agreement 
with those from the fully sampled, 'perfect' mocks. 

We have carried out two independent counts-in-cells 
analyses, using different algorithms to place cells within the 
survey volume. The results are insensitive to the details of 
the counts-in-cells algorithm. The CPDF is measured using 
2.5 X 10^ cells for each cell radius. We have further checked 
the counts in cells analysis by comparing the measured two 
point volume averaged correlation function with the integral 
of the measured spatial two point correlation function, given 
by Eq. 1; the integral of the spatial correlation function is in
very good agreement with the direct estimate of the volume 
averaged correlation function. 



3.3 Error Estimation 

We estimate the error on the higher order correlation func- 
tions and hierarchical amplitudes using the set of 22 mock 
2dFGRS surveys described by Norberg et al. (2002b). The 
$1\sigma$ errors that we show on plots correspond to the rms scatter
over the ensemble of mocks (see Norberg et al. 2001a).

Figure 2. The upper panel shows the skewness, $S_3$, recovered
from mock 2dFGRS catalogues in volume limited samples defined
by the magnitude range $-19 > M_{b_J} - 5\log_{10} h > -20$. The dotted
lines show the skewness measured in each catalogue. The points
show the mean skewness. The errorbars show the mean rms scatter
averaged over 22 mocks, as described in the text in Section
3.3. The lower panel shows the fractional error as a function of cell
radius. This panel shows how well we can expect to measure the
skewness in a catalogue of this size extracted from the 2dFGRS,
including the contribution from sampling variance.

To recap, we consider one of the mocks as the "data" and 
compute the variance around this "mean" using the remain- 
ing mock catalogues. This process is repeated for each mock 
in turn, and the rms scatter is taken as the square root of the
mean variance. We illustrate this approach in Fig. 2 for the case
of p = 3, for a volume defined by the magnitude range
$-19 > M_{b_J} - 5\log_{10} h > -20$. In the upper panel, the skewness,
S3, measured in each mock is shown by the dotted
lines. The points show the mean skewness averaged over the
ensemble of 22 mocks. The errorbars show the rms scatter on
these measurements. The lower panel shows the fractional
error that we expect on the measurement of S3 for this particular
volume limited sample. Beyond $20\,h^{-1}$Mpc, the fractional
error increases rapidly. Our estimate of the fractional
error automatically includes the contribution from sampling
variance due to large scale structure (sometimes referred to
as "cosmic variance"). To estimate the error on a measured
correlation function, we simply compute the fractional rms 
scatter for the equivalent volume limited sample using the 
ensemble of mocks, and multiply the measured quantity by 
the fractional error. 
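
The mock-based error recipe above can be sketched as follows. This is a literal reading of the description (each mock in turn treated as the "data", the variance of the remaining mocks taken about it, and the variances averaged), not a reproduction of the code actually used; the function names and the toy numbers are assumptions for illustration only.

```python
import numpy as np

def mock_rms_scatter(mock_values):
    """rms scatter from an ensemble of mocks.

    mock_values has shape (n_mocks, n_radii). Each mock in turn plays the
    role of the "data"; the variance of the remaining mocks about it is
    computed, and the variances are averaged over all choices of "data".
    """
    mock_values = np.asarray(mock_values, dtype=float)
    n_mocks = mock_values.shape[0]
    variances = []
    for i in range(n_mocks):
        others = np.delete(mock_values, i, axis=0)
        variances.append(np.mean((others - mock_values[i]) ** 2, axis=0))
    return np.sqrt(np.mean(variances, axis=0))

def error_on_measurement(measured, mock_values):
    """Scale the fractional mock scatter to the measured quantity."""
    frac = mock_rms_scatter(mock_values) / np.mean(mock_values, axis=0)
    return np.abs(measured) * frac

toy_mocks = np.array([[1.0, 2.0], [3.0, 2.0]])  # two toy "mocks", two cell radii
rms = mock_rms_scatter(toy_mocks)
err = error_on_measurement(np.array([4.0, 4.0]), toy_mocks)
```

Note that taking the scatter about each mock in turn, rather than about the ensemble mean, gives a somewhat larger number than the usual standard deviation of the ensemble.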

Figure 3. The higher order correlation functions measured for
2dFGRS galaxies. The symbols show the measurements for galaxies
in the absolute magnitude range $-19 > M_{b_J} - 5\log_{10} h > -20$;
the key gives the order p. The lines show the results for different
luminosity samples; the dashed lines show the results for galaxies
with $-18 > M_{b_J} - 5\log_{10} h > -19$ and the dotted lines show the
results for galaxies with $-20 > M_{b_J} - 5\log_{10} h > -21$.

We have compared the estimate of the rms scatter from
the ensemble of mocks with an internal estimate using a jackknife
technique (see, for example, Zehavi et al. 2002). In the
jackknife approach, the survey is split into subsamples. The
error is then the scatter between the measurements when
each subsample is omitted in turn from the analysis. The
jackknife gives comparable errors to the mock ensemble for
low order moments. On large scales, the higher order moments
are particularly sensitive to sample variance and, in
these cases, the jackknife approach can only provide a lower
bound to the true scatter.
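
For reference, a minimal sketch of a delete-one jackknife error is given below. It assumes the standard jackknife variance normalisation and, for simplicity, takes one summary value per subsample with the mean as the statistic; in a survey analysis the statistic would instead be recomputed from the galaxies that remain after each region is removed. The function name and toy numbers are illustrative assumptions.

```python
import numpy as np

def jackknife_error(subsample_values, statistic=np.mean):
    """Delete-one jackknife error on `statistic`."""
    values = np.asarray(subsample_values, dtype=float)
    n = values.size
    # Recompute the statistic with each subsample left out in turn.
    leave_one_out = np.array(
        [statistic(np.delete(values, i)) for i in range(n)]
    )
    # Standard delete-one jackknife variance.
    var = (n - 1) / n * np.sum((leave_one_out - leave_one_out.mean()) ** 2)
    return np.sqrt(var)

toy = [1.0, 2.0, 3.0, 4.0]
err = jackknife_error(toy)
```

For the mean, this reduces to the familiar standard error, which is a convenient sanity check on the normalisation.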

A more formal error estimation procedure is adopted 
when computing the best fit values for the hierarchical am- 
plitudes, Sp. In this case, we employ a principal component 
analysis to explicitly take into account the correlations between
the Sp inferred at different cell radii (see e.g. Porciani
& Giavalisco 2002 and Section 6 of Bernardeau et al. 2002). 
The mock catalogues are used to compute the full covariance 
matrix of the Sp data points to be fitted. Next, the eigen- 
values and eigenvectors of the covariance matrix are deter- 
mined. We find that, typically, the first few eigenvectors are 
responsible for over 90% of the variance. Given the number 
of data points that we consider in the fits, this means that 
we have around a factor of two to three times fewer inde- 
pendent points than data points fitted. (Details of the range 
of scales used in the fits will be given in Section 4.2.) We 
note that in most previous work, the Sp measured at dif- 
ferent cell radii were simply averaged together ignoring any 
correlations between bins, resulting in unrealistically small 
errors in the fitted values. 
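
One plausible way to implement a fit of this kind is sketched below: the covariance matrix is diagonalised, the leading eigenmodes that account for a chosen fraction of the variance are retained, and a constant is fitted by minimising chi-squared in that truncated eigenbasis. The function name, the 90% default threshold used as a cut, and the toy mock ensemble are assumptions for illustration; the text does not specify the implementation at this level of detail.

```python
import numpy as np

def fit_constant_pca(data, cov, var_frac=0.9):
    """Fit a constant to correlated data points using the leading
    principal components of their covariance matrix.

    Returns (best-fit constant, 1-sigma error, number of modes kept).
    """
    data = np.asarray(data, dtype=float)
    eigval, eigvec = np.linalg.eigh(cov)          # ascending eigenvalues
    order = np.argsort(eigval)[::-1]              # largest variance first
    eigval, eigvec = eigval[order], eigvec[:, order]
    cum = np.cumsum(eigval) / np.sum(eigval)
    n_keep = min(data.size, int(np.searchsorted(cum, var_frac)) + 1)
    # Project the data and the constant model (a vector of ones)
    # onto the retained eigenvectors.
    d_proj = eigvec[:, :n_keep].T @ data
    m_proj = eigvec[:, :n_keep].T @ np.ones_like(data)
    weight = np.sum(m_proj ** 2 / eigval[:n_keep])
    best = np.sum(d_proj * m_proj / eigval[:n_keep]) / weight
    return best, 1.0 / np.sqrt(weight), n_keep

# Toy ensemble: 22 "mocks" of 10 data points that mostly move up and down together.
rng = np.random.default_rng(0)
toy_mocks = (2.0 + rng.normal(0.0, 0.1, size=(22, 1))
             + rng.normal(0.0, 0.02, size=(22, 10)))
cov = np.cov(toy_mocks, rowvar=False)
best, err, n_modes = fit_constant_pca(toy_mocks.mean(axis=0), cov)
```

With all modes retained and an uncorrelated, equal-variance covariance matrix, the fit reduces to the plain average of the data points, which is the limit the text describes as giving unrealistically small errors when correlations are in fact present.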



4 RESULTS 

4.1 Volume-averaged correlation functions 

Figure 4. The dependence of the higher order correlation functions
on luminosity. The orders p = 3 (top panel) and p = 2 (bottom
panel) are shown. The correlation functions for samples of
different luminosity are divided by the correlation function measured
for L* galaxies, with $-19 > M_{b_J} - 5\log_{10} h > -20$.

The volume averaged correlation functions estimated from
the CPDF constructed from the combined NGP and SGP
cell counts are plotted in Fig. 3. The symbols show the correlation
functions for the L* sample with
$-19 > M_{b_J} - 5\log_{10} h > -20$. The lines show the measurements made for
galaxies in magnitude bins adjacent to L* (the dashed lines
correspond to a sample that is one magnitude fainter and the
dotted lines to a sample that is one magnitude brighter). The
correlation functions steepen dramatically on small scales as
the order p increases.
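
To make the step from cell counts to volume averaged correlation functions concrete, the sketch below uses factorial moments, which remove the Poisson shot noise of a discrete distribution, and the standard relations between the moments of the smoothed density field and its connected parts at second and third order. That this matches the estimator used in the paper is an assumption; the function names are illustrative.

```python
import numpy as np

def factorial_moment(counts, k):
    """<N (N-1) ... (N-k+1)> averaged over cells."""
    counts = np.asarray(counts, dtype=float)
    prod = np.ones_like(counts)
    for j in range(k):
        prod *= counts - j
    return prod.mean()

def xi_bar_2_3(counts):
    """Volume averaged 2- and 3-point functions from cell counts.

    Uses F_k / Nbar^k = <(1 + delta)^k>, so that
    xi2 = f2 - 1 and xi3 = f3 - 3 f2 + 2.
    """
    nbar = np.mean(counts)
    f2 = factorial_moment(counts, 2) / nbar ** 2
    f3 = factorial_moment(counts, 3) / nbar ** 3
    return f2 - 1.0, f3 - 3.0 * f2 + 2.0

xi2, xi3 = xi_bar_2_3([0, 2])  # tiny toy set of cell counts
```

Both quantities vanish, up to noise, for a Poisson distribution of counts, and the skewness then follows as S3 = xi3 / xi2**2.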

To better quantify the dependence of the higher order
correlation functions on luminosity, we plot the ratio of the
$\bar{\xi}_p$ to the results for the L* reference sample in Fig. 4 for
the cases p = 2 and p = 3. The variance in the distribution
of counts-in-cells on a given smoothing scale increases
with the luminosity of the volume limited sample (see the
bottom panel of Fig. 4). This effect is similar to that reported
by Norberg et al. (2001, 2002a), who measured the
dependence of the strength of galaxy clustering on luminosity
in real space, whereas our results are in redshift space.
This behaviour is broadly seen to extend to the higher order
clustering; however, the ranking of the amplitude of the
higher order correlation functions with luminosity is not always
preserved on large scales. This issue is investigated
further in Section 4.3.



4.2 Hierarchical clustering 

Figure 5. The volume averaged correlation functions, $\bar{\xi}_p$, for
p = 3 to 6, plotted as a function of the variance, $\bar{\xi}_2$. Each panel
corresponds to a different order plotted on the ordinate, as indicated
by the legend. (Note that $\bar{\xi}_5$ and $\bar{\xi}_6$ are not plotted for
the brightest sample, as they are too noisy.) The symbols refer to
different magnitude ranges as given by the key in the first panel.
The line styles denote the results for different absolute magnitude
ranges, as indicated by the legend. The thick grey lines show power-laws
with slopes of 2, 3, 4 and 5 in order of increasing amplitude,
which are intended to act as a reference.

We use the measured volume averaged correlation functions
from Fig. 3 to test the hierarchical clustering model set out
in Section 2.2 and Eq. 7. In Fig. 5 we plot the p = 3 to 6 point
volume averaged correlation functions as a function of the
variance (or two-point function) measured on the same scale.
Small values of the moments correspond to large cells. The
thick grey lines show the higher order moments expected in
the hierarchical model. (From Eq. 7, the offsets of these lines
are the hierarchical amplitudes Sp. We have used the best fit
values of Sp that we obtain later on in this Section. However,
the width of the lines does not indicate the error on the fit:
the lines are intended merely to guide the eye.) On small
scales (large variances), hierarchical scaling is followed. On
intermediate and large scales, for which the variance drops
below ~ 1.3, the measured moments depart somewhat from
the hierarchical scaling behaviour, particularly in the case
of the higher orders.
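
The hierarchical scaling test described above amounts to checking that, in a log-log plot, the p-point function traces a straight line of slope p - 1 against the variance, with an offset set by the hierarchical amplitude. A minimal sketch on synthetic numbers (not survey data) follows; the function name and the toy values are assumptions for illustration.

```python
import numpy as np

def hierarchical_fit(xi2, xip):
    """Fit log10(xi_p) = slope * log10(xi_2) + log10(amplitude)."""
    slope, offset = np.polyfit(np.log10(xi2), np.log10(xip), 1)
    return slope, 10.0 ** offset

xi2 = np.array([0.5, 1.0, 2.0, 4.0, 8.0])
xi3 = 2.0 * xi2 ** 2                  # toy data obeying the hierarchy with S3 = 2
slope, amplitude = hierarchical_fit(xi2, xi3)
s3 = xi3 / xi2 ** (3 - 1)             # hierarchical amplitude at each scale
```

For data that follow the hierarchy, the fitted slope is p - 1 and the per-scale ratio is constant; departures of either from these values flag a breakdown of the scaling.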

The hierarchical scaling of the higher order correlation
functions is exploited to plot the hierarchical amplitudes
$S_p = \bar{\xi}_p/\bar{\xi}_2^{\,p-1}$ as a function of cell radius in Fig. 6.
Each panel corresponds to a different volume limited sample,
where the lines and points correspond to S3, S4 and S5 in
order of increasing amplitude. The hierarchical amplitudes
measured from the two brightest volume limited samples
systematically show an increase around $10\,h^{-1}$Mpc. This effect
is particularly significant in the $-19 > M_{b_J} - 5\log_{10} h > -20$
sample, with the Sp increasing by a factor of 2 to 5 depending
on p. On smaller scales the hierarchical amplitudes
are essentially independent of the cell radius for all magnitude
ranges considered. It should be noted that the Sp measured
in real space vary more strongly in amplitude with
scale than in redshift space, particularly at small cell radii
(Gaztañaga 1994; Szapudi et al. 1995; Szapudi & Gaztañaga
1998).

Figure 6. The hierarchical amplitudes, Sp, for p = 3, 4 and 5, plotted as a function of cell radius for the galaxy samples defined in
Table 1. Each panel shows the results for a different volume limited catalogue, as indicated by the legend. The points with errorbars
show the results obtained from the full volume limited samples: triangles show S3, squares show S4 and pentagons show S5. The solid
lines show the best fit values and the dotted lines indicate the 1σ errors on the fits, as described in the text. The lines are plotted over
the range of scales used in the fits.

We have fit constant values to the measured Sp, using
the principal component analysis outlined in Section 3.3.
This approach takes into account the correlations between
the measurements on different scales. The range of scales
used to fit Sp is held fixed for each volume limited sample
and is quoted in Table 2. Typically, there are ten values of
Sp in the range considered in the fits. The principal component
analysis reveals that just 2 to 4 linear combinations
of these points account for more than 90% of the variance;
this gives a fairer impression of the number of independent
data points. The principal eigenvector is in all cases almost
independent of scale, i.e. its effect is to move all the points
coherently up and down (driven by large scale variation in
the mean density estimated from the survey). Therefore,
the best fitting constant tends to favour a fit either slightly
above or below each set of data points. This is exactly what
is seen in the various panels of Fig. 6. The best fit constants
to the measured Sp are given in Table 2, along with an error
from the principal component analysis. The fits to S3 and
S4 for the $-19 > M_{b_J} - 5\log_{10} h > -20$ sample are poor
in terms of the reduced $\chi^2$. There is some dependence of the
Sp on luminosity. This behaviour is explored in
Section 5.



4.3 Systematic effects: the influence of 
superclusters 

The higher order moments of the CPDF are sensitive to 
the presence of massive structures that contribute to the 
extreme event tail of the count distribution. It is therefore
important to examine the 2dFGRS to look for any rare large
scale structures that could exert a significant influence on
the form of the CPDF. The projected density of galaxies
in the right ascension-redshift plane for a volume limited
catalogue defined by the magnitude range
$-19 > M_{b_J} - 5\log_{10} h > -20$ is plotted in figure 1 of Baugh et al. (2004).
There are two clear hot spots or superstructures apparent
in this figure, one in the NGP at a redshift of z = 0.08 and a
right ascension of 3.4 hours, and the other in the SGP at z =
0.11 at a right ascension of 0.2 hours. These structures are
confirmed as superclusters of galaxies in the group catalogue
constructed from the flux limited 2dFGRS (Eke et al. 2004);
of the 94 groups in the full flux limited survey out to z ~ 0.15
with 9 or more members and estimated masses above
$5 \times 10^{14}\,h^{-1} M_\odot$, 20% reside in these superclusters. As a result of
the redshift at which these superclusters lie, these structures
are only influential in volume limited samples brighter than
$M_{b_J} - 5\log_{10} h = -18$.

The results presented earlier in this Section show fea- 
tures that could be due to the presence of these superclus- 
ters. For example, the volume averaged correlation functions 
for the $-19 > M_{b_J} - 5\log_{10} h > -20$ sample plotted in Fig. 3
appear to have more power on large scales than those mea- 
sured from the other volume limited samples. This is con- 
sistent with the theoretical expectations for measurements 
that are strongly affected by the presence of a supercluster: 
a boost in the clustering amplitude on large scales, due to a 
structure with a larger bias, and a reduction in the cluster- 
ing amplitude on small scales arising from the large velocity 
dispersions within the clusters making up the structure. 

To investigate this hypothesis, we have carried out the 
test of removing the two superclusters from the sample and 
recomputing the volume averaged correlation functions. The 
goal of this exercise is not to "correct" the measured corre- 
lation functions but rather to illustrate the impact of the 
superclusters on our results. We remove the superclusters 
by masking out their central densest regions, corresponding 
to prohibiting the placement of cells within a sphere of radius
$25\,h^{-1}$Mpc from each supercluster centre (for a different
approach to taking this type of effect into account see
Colombi, Bouchet & Schaeffer 1994 and Fry & Gaztañaga
1994).
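
The masking step can be sketched as a simple rejection of candidate cells. The sketch below assumes that positions are comoving Cartesian coordinates in $h^{-1}$Mpc and interprets the exclusion as applying to cell centres that fall within the masking radius of a supercluster centre; both the interpretation and the function name are assumptions for illustration.

```python
import numpy as np

def keep_cells_outside_mask(cell_centres, mask_centres, mask_radius=25.0):
    """Discard cells whose centres lie within mask_radius of any masked centre."""
    cell_centres = np.asarray(cell_centres, dtype=float)
    mask_centres = np.asarray(mask_centres, dtype=float)
    # Distance from every cell centre to every masked centre.
    diff = cell_centres[:, None, :] - mask_centres[None, :, :]
    dist = np.sqrt(np.sum(diff ** 2, axis=2))
    keep = np.all(dist >= mask_radius, axis=1)
    return cell_centres[keep]

cells = [[0.0, 0.0, 0.0], [30.0, 0.0, 0.0], [100.0, 0.0, 0.0]]
masked = [[10.0, 0.0, 0.0]]           # one hypothetical supercluster centre
kept = keep_cells_outside_mask(cells, masked)
```

The CPDF and its moments are then recomputed from the surviving cells only, leaving the galaxy catalogue itself untouched.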

Figure 7. The probability, $P_N$, of finding exactly N galaxies in
randomly placed cells of radius $10\,h^{-1}$Mpc (the CPDF, Eq. 2),
for different volume limited galaxy samples. Each bold line shows
the full volume CPDF, while the individual dotted lines give the
result after the supercluster regions have been omitted from the
analysis, as described in Section 4.3.

Fig. 7 shows the effect of the supercluster removal on
the tail of the CPDF for $10\,h^{-1}$Mpc radius cells, calculated
for three volume limited catalogues centred on L*. The
mean number of galaxies in a cell for each galaxy sample
is roughly 40, 24, and 6 going from faintest to brightest.
The presence of the two superclusters makes a clear difference
to the high N counts for galaxy samples brighter than
$M_{b_J} - 5\log_{10} h = -19$. The maximum redshift of the faint
volume limited catalogue in this figure only marginally includes
the NGP supercluster, and so $P_N$ remains essentially
unaffected in this case.

Fig. 8 shows volume averaged correlation functions of
order p = 2, 3, 4 for three volume limited catalogues from
Table 1, where each panel corresponds to a fixed absolute
magnitude range. The lines correspond to different orders
of clustering, starting with the lowest in amplitude, the
two point volume averaged correlation function, and moving
through to the four point function, at which we stop
plotting the results for clarity although the trends shown
continue up to sixth order. The solid curves show the correlation
functions measured from the full volume limited samples,
as shown previously in Fig. 3, and the dashed lines show
the results when the regions containing the superclusters are
excluded from the CPDF. The higher order correlation functions
are systematically boosted on intermediate and large
scales when the superclusters are included in the analysis.
The precise scale on which the correlation functions become
sensitive to the presence of the superclusters depends upon
the order; for the four point function, the two estimates of
the correlation function typically deviate for cells of radius
$3\,h^{-1}$Mpc and larger.

Figure 8. The volume averaged correlation functions for p = 2
to 4, with each panel showing the results from a different volume
limited sample, as indicated by the legend. The solid lines show
the estimates from the full volumes and the dashed lines show
the results when the supercluster regions are omitted from the
analysis. For clarity, errorbars are only plotted on the solid curves
for order p = 4.

Figure 9. The hierarchical amplitudes, S3 (triangles), S4
(squares) and S5 (pentagons). The top panel corresponds to
galaxies with $-19 > M_{b_J} - 5\log_{10} h > -20$ and the bottom
panel to $-20 > M_{b_J} - 5\log_{10} h > -21$. The open symbols with
errorbars show the results obtained using the full volume limited
catalogues. The filled symbols show how the results change
when regions containing the two superclusters are omitted from
the analysis.

The impact on the hierarchical amplitudes, Sp, of removing
the superclusters is shown in Fig. 9, in which we
plot the results for the volume limited sample defined by
$-19 > M_{b_J} - 5\log_{10} h > -20$. In Fig. 9 the open points
show the hierarchical amplitudes measured from the full
volume limited sample. The filled symbols show the results
obtained from the same volume but with the supercluster
regions masked out. The Sp obtained when the two superclusters
are removed from the analysis are much closer to being
independent of cell size. The sensitivity of higher orders
to rare peaks has been noticed in earlier analyses of galaxy
surveys (Groth & Peebles 1977; Gaztañaga 1992; Bouchet
et al. 1993; Lahav et al. 1993; Gaztañaga 1994; Hoyle et
al. 2000).



5 INTERPRETATION AND THE IMPLICATIONS FOR GALAXY BIAS

In this Section we quantify how the hierarchical amplitudes 
scale with galaxy luminosity and discuss the implications of 
our results for simple models of galaxy bias. We first test the 
hypothesis set out in Section 2.4 that the variation in clustering
with luminosity apparent in Fig. 3 can be described
by a single, relative bias factor, as defined by Eq. 14. The
relative bias factors, $b_r$, computed from the variance, and
the deviation from the linear bias model, as quantified by
$c_2$ (Eq. 17), are listed in Table 2; here the mean value is
given by the best fit over all cell radii. The change in
the amplitude of the relative bias with sample luminosity, 
shown in Fig. 4, is in excellent agreement with the trend 
found by Norberg et al. (2001), who analysed the projected 
spatial clustering of 2dFGRS galaxies. This agreement is re- 
markable given the different approaches used to measure the 
two-point correlations and the fact that the analysis in this 
paper is in redshift space, whereas the study carried out by 
Norberg et al. was unaffected by peculiar motions. 
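
As an illustration of how a relative bias factor and the corresponding linear-bias prediction might be computed, the sketch below assumes the standard linear relative bias relations, in which the variance of a sample is $b_r^2$ times that of the reference sample and the hierarchical amplitudes scale as $S_p = S_p^{\rm ref}/b_r^{\,p-2}$. Eqs. 14 and 15 are not reproduced in this section, so their identification with these relations is an assumption, as are the function names and toy numbers.

```python
import numpy as np

def relative_bias(xi2_sample, xi2_reference):
    """Relative bias from the ratio of variances on the same scale."""
    return np.sqrt(np.asarray(xi2_sample) / np.asarray(xi2_reference))

def linear_bias_sp(sp_reference, b_r, p):
    """Hierarchical amplitude predicted by a purely linear relative bias."""
    return sp_reference / b_r ** (p - 2)

b_r = relative_bias(1.21, 1.0)             # toy variances, not survey values
s3_pred = linear_bias_sp(2.0, b_r, p=3)    # toy reference skewness of 2
s4_pred = linear_bias_sp(6.0, b_r, p=4)
```

In this picture a more strongly clustered (brighter) sample is expected to show smaller hierarchical amplitudes, with the suppression growing with the order p.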

Figure 10. The variation of the hierarchical amplitudes, Sp,
with absolute magnitude. The points are plotted at the median
magnitude of each volume limited sample and the horizontal bars
indicate the interval in which 25% to 75% of the galaxies fall,
computed using the 2dFGRS luminosity function fit quoted by
Norberg et al. (2002b). Each panel shows the results for a different
order of clustering. The dotted line shows predictions of the linear
relative bias model for the variation of the Sp with luminosity
(Eq. 15). The solid lines show linear fits in log luminosity to the
observed trend in the value of Sp with sample luminosity (see text
of Section 5 for details).

The coefficients $c_2$ are different from zero at a 1σ level.
These findings are consistent with a small deviation from the
linear biasing model (at a 2σ level for the brighter samples).
This is in qualitative agreement with the estimation
of $c_2$ using the bispectrum (Scoccimarro 2000; Verde et
al. 2002) or the 3-point function measured from the parent
APM galaxy survey (Frieman & Gaztañaga 1999).

The variation of the hierarchical amplitudes with luminosity
is plotted in Fig. 10. Each panel corresponds to
a different order p. The filled points show the hierarchical
amplitudes averaged over the different cell radii employed
(these values and the associated errors are given in Table 2).
The dotted line shows the hierarchical amplitudes predicted
by the linear relative bias model (Eq. 15), using the best fit
bias factors stated in Table 2. This model gives a rough approximation
to the data. However, the observed variation of
Sp with luminosity is somewhat better described by a linear
fit in the logarithm of luminosity, as shown by the solid lines.
This implies that the dependence of the hierarchical amplitudes
on luminosity is more complicated than expected in
the simple relative bias model of Eq. 15 (as does the fact that
we find some evidence for non-zero values for $c_2$). The solid
lines show the best linear fit to the hierarchical amplitudes
as a function of the logarithm of the median luminosity of
the samples:

$$S_p = A_p + B_p \log_{10}\left(\frac{L}{L_*}\right) \qquad (18)$$



We find a greater than 2σ ($\Delta\chi^2 > 7.2$ for two parameters)
detection of a non-zero value for $B_3$. However, for p > 3, the
constraints on $B_p$ are much weaker and there is no clear
evidence for a luminosity dependence in the Sp values in
these cases. For completeness, the best fit values for each order
are: $(A_3, B_3) = (2.07, -0.40)$, $(A_4, B_4) = (6.15, -2.51)$,
$(A_5, B_5) = (21.3, -13.5)$, $(A_6, B_6) = (58, -39)$.
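
The fit of Eq. 18 can be evaluated and reproduced with a few lines of code. The sketch below assumes that the argument of the logarithm is the luminosity in units of L* and uses the best fit coefficients quoted above; the function names are illustrative, and the unweighted least-squares fit shown here ignores the errors and covariances that enter the fits reported in the text.

```python
import numpy as np

# Best fit (A_p, B_p) pairs quoted in the text for Eq. 18.
BEST_FIT = {3: (2.07, -0.40), 4: (6.15, -2.51),
            5: (21.3, -13.5), 6: (58.0, -39.0)}

def sp_model(lum_ratio, p):
    """S_p = A_p + B_p log10(L / L*), with lum_ratio = L / L* (assumed)."""
    a_p, b_p = BEST_FIT[p]
    return a_p + b_p * np.log10(lum_ratio)

def fit_log_luminosity(lum_ratio, sp):
    """Unweighted straight-line fit of S_p against log10(L / L*)."""
    b_p, a_p = np.polyfit(np.log10(lum_ratio), sp, 1)
    return a_p, b_p

lum = np.array([0.1, 1.0, 10.0])
a3, b3 = fit_log_luminosity(lum, sp_model(lum, 3))  # recovers the input values
```

With these coefficients the model skewness is 2.07 at L* and falls by 0.40 for each factor of ten increase in luminosity.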



6 CONCLUSIONS 

In this paper we have measured the higher order correla- 
tion functions of galaxies in volume limited samples drawn 
from the 2dFGRS. The most recent comparable work is the 
analysis of the Stromlo-APM and UKST redshift surveys by 
Hoyle, Szapudi & Baugh (2000). These authors also considered
volume limited subsamples drawn from the flux limited
redshift survey. The largest UKST sample considered
by Hoyle et al. contained 500 galaxies and covered a volume 
of 9 X 10^ $h^{-3}$Mpc$^{3}$; the reference sample used in our work
contains 90 times this number of galaxies and covers ten 
times the volume. In our analysis, we can follow the vari- 
ation of clustering over more than a decade in luminosity, 
whereas Hoyle et al. had to focus their attention around L*.

The measurement of the higher order galaxy correlation 
functions is still challenging, however. In spite of the order 
of magnitude increase in size that the 2dFGRS represents 
over previously completed surveys, we have found that the 
higher order moments that we measure are somewhat sensi- 
tive to the presence of large structures. In particular, there 
are two superclusters that influence our measurements, one 
in the SGP region and the other in the NGP. These struc- 
tures contain a sizeable fraction of the cluster mass groups 
in the 2dFGRS (Eke et al. 2004). The inclusion of these 
structures has an impact on our estimates of the three point 
and higher order volume averaged correlation functions on 
scales of around 4 to $10\,h^{-1}$Mpc and above, depending on the
order of the correlation function. For this reason, we have 
presented measurements of the higher order correlation func- 
tions both with and without these structures. We stress that 
the removal of these superclusters should not be considered 
a correction to the full catalogue results, but rather as an 
indication of the impact of rare structures on our results for
the higher order moments. On the other hand, the up-turn 
that we find in the values of the hierarchical amplitudes on 
large scales is predicted by some structure formation models; 
for example models with non-Gaussian initial density fields 
predict a similar form for the Sp as we measure from the 
full volume limited samples (Gaztañaga & Mähönen 1996;
Gaztañaga & Fosalba 1998; Bernardeau et al. 2002).

The difficulties in estimating Sp values on large, quasi-linear
scales ($> 10\,h^{-1}$Mpc) prevent a direct comparison
with perturbation theory (see Bernardeau et al. 2002). The
current best estimates on these scales are still those measured
from the angular APM Galaxy Survey (Gaztañaga
1994; Szapudi et al. 1995; Szapudi & Gaztañaga 1998). At
the time of writing, the results from the SDSS Early Data
Release are still limited to small scales (Gaztañaga 2002;
Szapudi et al. 2002). Despite being unable to make a robust
measurement of the higher order correlation functions on the 
very large scales for which weakly non-linear perturbation 
theory is applicable, we are still able to reach a number of 
interesting conclusions: 

(i) We have demonstrated that the higher order galaxy correlation
functions measured from the 2dFGRS follow a hierarchical
scaling. Baugh et al. (2004) showed that L* galaxies
display higher order correlation functions that scale in a hierarchical
fashion; we have extended these authors' analysis
to cover a wide range of galaxy luminosity. The higher order
moments of the galaxy count distribution are proportional
to the variance raised to a power that depends upon the
order of the correlation function under consideration. This
behaviour holds on physical scales ranging from those on
which we expect the underlying density fluctuations to be
strongly nonlinear all the way through to quasi-linear scales.
This scaling has been tested up to the six point correlation
function for the first time using a redshift survey. This confirms
the conclusions of a complementary analysis carried
out by Croton et al. (2004), who found hierarchical scaling
when measuring the reduced void probability function of the
2dFGRS.

(ii) We have estimated values of the hierarchical amplitudes,
$S_p = \bar{\xi}_p/\bar{\xi}_2^{\,p-1}$, for cells of different radii. The hierarchical
amplitudes are approximately constant on small to
medium scales (depending on the order considered), while
for the larger volumes, Sp seem to increase with radius at
large scales. Although this could in principle result from
a boundary or mask effect (e.g. see Szapudi & Gaztañaga
1998; Bernardeau et al. 2002), we have shown with mock catalogues
that this is not the case here (e.g. see Fig. 1). If the
two most massive superclusters in the survey are removed
from the analysis, the hierarchical amplitudes are remarkably
independent of cell radius over all scales. That the Sp are roughly
constant on small scales, with smaller amplitudes than in
real space (e.g. Gaztañaga 1994), has been noted before for
measurements in redshift space. It arises due to a cancellation
of the enhanced signal on small scales in real space
by a damping of clustering in redshift space due to peculiar
motions (Lahav et al. 1993; Fry & Gaztañaga 1994; Hivon
et al. 1995; Hoyle et al. 2000; Bernardeau et al. 2002).

(iii) We find that the amplitude of the higher order correlation
functions scales with luminosity. The magnitude of the
luminosity segregation increases with the order of the correlation
(see Fig. 4). For the variance, $\bar{\xi}_2$, the strength of the
trend is in very good agreement with that reported by Norberg
et al. (2001), but note that these authors measured the
luminosity segregation in real space, whereas our results are
in redshift space. The strength of the luminosity segregation
for higher orders can be mostly explained as the result of hierarchical
scaling, $\bar{\xi}_p \propto \bar{\xi}_2^{\,p-1}$, so that most of the effect can
be attributed to luminosity segregation in the variance. This
can be seen in Fig. 5, where data from different luminosities
trace out the same hierarchical curve with little scatter.

(iv) We find some evidence for a residual dependence of Sp
on luminosity, although the effect is only significant within
the errors for the skewness, p = 3 (greater than 2σ level).
It is not clear whether this is driven by a pure luminosity
dependence of the higher order clustering, by a change in
the galaxy mix with luminosity, with different galaxy types
having different Sp, or by a combination of the two effects;
see Norberg et al. (2002a) for an investigation of this point
for the 2-point correlation function. A simple linear relative 
bias model (dotted line in Fig. 10) does not reproduce the 
dependence of the Sp on luminosity. 

We have interpreted our results in terms of a simple, 
local bias model, and we have quantified trends in cluster- 
ing amplitude with luminosity by estimating relative bias 
factors. These measurements, summarised in Table 2, ex- 
tend the constraints upon models of galaxy formation de- 
rived from the two-point correlation function, quantifying 
the shape of the tails of the count probability distribution 
as well as its width. 